Abstract. Most atoms and molecules found in nature are capable of
evolving towards and staying at their ground states, the lowest energy
states. This paper offers a global optimization approach to understanding
the ground state as the equilibrium point of an n-player game under
cooperation. With the same approach, a Nash equilibrium can be viewed
as the equilibrium point under competition. The former can offer higher
payoffs and greater stability of game playing than the latter. It is truly
an inspiration from nature for us to build societies for quality and
stability under cooperation rather than competition.



1 Introduction 

The quantum world has many bizarre behaviors [1,2], such as wave-particle
duality and the Heisenberg uncertainty principle. One puzzle is why most
atoms and molecules found in nature are capable of evolving towards and
staying at their ground states, the lowest energy states. Otherwise, no atom
or molecule would be stable in nature and the world would be completely out
of order.

The ground states, and the stationary states in general, are described by
the Schrödinger equation in quantum mechanics [5]. However, nobody knows of
any deeper principle underlying the equation. This paper attempts to show
that the equation can be derived from a global optimization algorithm, called
cooperative optimization [3]. Indeed, from a computational point of view,
each atomic system needs to follow a global optimization process so that it
can find its ground state, the lowest energy state.

Cooperative optimization describes a system with multiple agents, each with 
its own objective, working together under cooperation to make a joint decision 
to optimize the collected objectives of individuals. The same language of math- 
ematics can be used to describe a molecule of multiple atoms and a game of 
multiple players. Cooperative optimization offers a mathematical framework to 
understand the computational properties of multi-agent systems under different 
competition and cooperation strategies. 

Specifically, if every agent in a multi-agent system accepts only its best
action and rejects all sub-optimal ones, the result is direct competition
among the agents. If an equilibrium under this kind of competition is reached,
it will be shown in Section 3 that it is a Nash equilibrium [4] as defined in
game theory. Nash equilibrium is a key concept in game theory for
understanding games with rational players.

In contrast to the above scenario, if every agent competes less aggressively
and is willing to accept sub-optimal actions to some degree, the result is
cooperation among the agents. If an equilibrium under this kind of cooperation
is reached, it will be shown in Section 2 that it is a stationary state
described by the Schrödinger equation in quantum mechanics. Compared with
competition, cooperation can greatly increase the possibility for a
multi-agent system to find its global optimal equilibrium state.

Taking a molecule with multiple atoms as an example, if every atom is ex- 
tremely aggressive at fighting for the best location in terms of potential energy 
described by classical physics, the molecule follows a local optimization process 
and will get stuck in one local minimum energy state or another, rather than
the global one. Therefore, the molecule is not stable because it may have an 
enormous number of local minimal energy states and its final state can be any 
one of them, sensitive to its initial configuration and perturbations during the 
process. 

By contrast, if every atom accepts sub-optimal locations to a certain
degree while giving the best location the highest acceptance, the molecule
tends to follow a global optimization process and evolves towards the lowest
energy state (the global minimal energy state). In this case, the molecule
becomes stable, insensitive to its initial configuration and perturbations
during the process. In the end, all the atoms in the molecule jointly get the
best possible locations both in a global sense and in an average sense.

If each atom accepts only the best location, it will have a precise location
in space at any given time instant. That is the picture of the world given by
classical physics. By contrast, if each atom accepts all locations, the best
and the sub-optimal ones, in proportion to their goodness, its location will
be spread in space with a probability-like distribution. That is exactly the
physical reality of the quantum world.

Accepting sub-optimal actions as a generic decision-optimization strategy is
truly an inspiration from the quantum world. It offers the definition of
another type of equilibrium point of n-player games besides the Nash
equilibrium. The quantum world suggests that, at a Nash equilibrium, if every
player gives away some payoff by accepting sub-optimal actions to some degree,
each of them may actually receive a better return at a new equilibrium point
than at the original one. At the same time, the game playing may become more
stable since the number of its equilibrium points can be reduced remarkably.
Sometimes, it may be reduced to a single one corresponding to the social
optimum.

2 A globalization approach to quantum mechanics 

Given a society with n individuals, assume that the objective of individual
$i$ is described as minimizing a function $E_i(x)$. A simple form of
cooperative optimization is defined as an iterative computation of each
individual's expected returns described by a function $\Psi_i(x_i,t)$, for
$i = 1, 2, \ldots, n$, as follows:

$$\Psi_i(x_i,t) = \sum_{\sim x_i} \left( e^{-E_i(x)/\hbar} \prod_{j \neq i} p_j(x_j, t-1) \right), \tag{1}$$

where $\sum_{\sim x_i}$ stands for the summation over all variables except
$x_i$. Here $\hbar$ is a constant
of a small positive value. $p_i(x_i,t)$ is a probability-like function for
picking action $x_i$, proportional to $(\Psi_i(x_i,t))^{\alpha}$
($\alpha > 0$), i.e.,

$$p_i(x_i,t) = (\Psi_i(x_i,t))^{\alpha} / Z_i(t), \tag{2}$$

where $Z_i(t)$ is a normalization factor defined as
$Z_i(t) = \sum_{x_i} (\Psi_i(x_i,t))^{\alpha}$.

The larger the parameter $\alpha$ is, the more aggressive each individual is
at minimizing his own objective function $E_i(x)$. At the same time, the game
tends to have more equilibrium points. However, the chance for the society to
reach the social (global) optimum peaks at a certain value of $\alpha$,
neither too large nor too small. In this case, each individual in the game
compromises his best action by accepting sub-optimal actions to some degree,
unlike the case $\alpha \to \infty$, where only the best action is accepted
(see Eq. (2)).

In particular, when $\alpha = 2$, the simple general form (1) in a
continuous-time version is

$$-\hbar \frac{d\psi_i(x_i,t)}{dt} = \frac{1}{Z_i(t)} \, e_i(x_i) \, \psi_i(x_i,t). \tag{3}$$

Following the notation from physics, denote $\psi_i(x_i,t)$ as a vector
$|\psi_i(t)\rangle$. Let $H_i$ be a diagonal matrix with diagonal elements
$e_i(x_i)$. Then the equation becomes

$$-\hbar \frac{d}{dt} |\psi_i(t)\rangle = \frac{1}{Z_i(t)} \, H_i \, |\psi_i(t)\rangle. \tag{4}$$

The above equation can be further generalized with a Hermitian matrix $H_i$.

The expected return function $\psi_i(x_i,t)$ is also called a wavefunction in
physics. It is important to note that equation (4) is the dual equation of the
Schrödinger equation, where $-1$ is replaced by the imaginary unit $i$ and the
normalization factor $Z_i(t)$ is not required since the evolution is unitary.
When the dynamic equation (4) reaches a stationary point (equilibrium), it
becomes the time-independent Schrödinger equation:

$$\lambda_i \, |\psi_i(x_i,t)\rangle = H_i \, |\psi_i(x_i,t)\rangle,$$

where $\lambda_i$ can only be one of the eigenvalues of $H_i$.

Theorem 1. When the parameter $\alpha = 2$, the global optimization algorithm
(1) in a continuous-time version becomes a dual equation of the Schrödinger
equation in quantum mechanics. It falls back to the time-independent
Schrödinger equation whenever an equilibrium point is reached.
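
The behavior stated in Theorem 1 can be illustrated numerically. The sketch
below is not taken from the paper: it assumes $\hbar = 1$, replaces the
factor $1/Z_i(t)$ by a renormalization of the vector after every step, and
uses a small hypothetical Hermitian matrix H, a step size dt, and a step count
chosen only for illustration. Under those assumptions, a forward-Euler
discretization of the dual equation (4) relaxes towards the eigenvector of H
with the lowest eigenvalue, i.e. towards a solution of the time-independent
equation.

```python
import math

def mat_vec(H, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(h * x for h, x in zip(row, v)) for row in H]

def normalize(v):
    """Rescale v so that the sum of its squared entries is one."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def relax_to_ground_state(H, psi0, dt=0.1, steps=500):
    """Euler steps of d(psi)/dt = -H psi, renormalizing after each step.

    Assumption: hbar = 1 and the 1/Z(t) factor is handled by normalize().
    """
    psi = normalize(psi0)
    for _ in range(steps):
        h_psi = mat_vec(H, psi)
        psi = normalize([p - dt * hp for p, hp in zip(psi, h_psi)])
    # Rayleigh quotient psi^T H psi (psi has unit length).
    energy = sum(p * hp for p, hp in zip(psi, mat_vec(H, psi)))
    return psi, energy

# Hypothetical 3x3 symmetric matrix; its eigenvalues are 2 - sqrt(2), 2, 2 + sqrt(2).
H = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]

psi, energy = relax_to_ground_state(H, [1.0, 0.0, 0.0])
print("stationary vector:", [round(p, 4) for p in psi])
print("energy:", round(energy, 6), "lowest eigenvalue:", round(2 - math.sqrt(2), 6))
```

At the stationary point the vector no longer changes direction, so H psi is
proportional to psi, which is the eigenvalue relation above.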



3 From quantum mechanics to game theory



Let $U_i(x) = e^{-E_i(x)/\hbar}$ be the utility function of player $i$ in a
game of n players. In this case, player $i$ tries to maximize his utility
function $U_i(x)$ instead of minimizing his objective function $E_i(x)$. Both
tasks are fully equivalent to each other. In this case, (1) becomes

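$$\Psi_i(x_i,t) = \sum_{\sim x_i} U_i(x) \prod_{j \neq i} p_j(x_j, t-1), \tag{5}$$

where

$$p_i(x_i,t) = \frac{(\Psi_i(x_i,t))^{\alpha}}{\sum_{x_i} (\Psi_i(x_i,t))^{\alpha}}. \tag{6}$$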

In (6), when $\alpha \to \infty$, the best action $x_i$ for player $i$ has a
non-zero probability while the others have probability zero. That is, the
player only accepts
the best action, the one with the highest payoff $U_i(x_i, p_{-i})$. In this case, the
player is completely selfish. 

If the value of $\alpha$ is reduced from the above extreme case, player $i$
starts to accept sub-optimal actions by assigning non-zero probability to
them. The degree of acceptance increases as $\alpha$ decreases further. At the
other extreme, when $\alpha \to 0$, each action is assigned the same
probability and the player has no preference for any one of the actions. All
of the actions are treated equally and they are sampled uniformly. In this
case, the player is completely selfless.

In summary, the parameter $\alpha$ describes the selfishness level of
player $i$. It covers the spectrum ranging from complete selfishness
($\alpha \to \infty$) to complete selflessness ($\alpha = 0$).
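
The two extremes of the selfishness level can be checked directly from the
normalization in Eq. (6). The following is a minimal sketch, assuming
strictly positive expected returns; the function name and the three sample
values are hypothetical and serve only as an illustration.

```python
def action_probabilities(psi, alpha):
    """Eq. (6): probability of each action proportional to psi ** alpha.

    Assumes every entry of psi is strictly positive.
    """
    weights = [value ** alpha for value in psi]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical expected returns of one player for three actions.
psi = [1.0, 2.0, 4.0]

uniform = action_probabilities(psi, 0.0)    # complete selflessness
mild = action_probabilities(psi, 1.0)       # proportional to the returns
greedy = action_probabilities(psi, 50.0)    # close to complete selfishness

for name, probs in (("alpha=0", uniform), ("alpha=1", mild), ("alpha=50", greedy)):
    print(name, [round(p, 4) for p in probs])
```

With alpha = 0 every action gets the same probability, with alpha = 1 the
probabilities are proportional to the returns, and with a large alpha
almost all of the probability mass sits on the best action.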

Theorem 2. By the Brouwer fixed-point theorem, an equilibrium point always
exists for the set of equations (5) given any value of the selfishness level
$\alpha$. This remains true even if each player $i$ in the game has his own
selfishness level $\alpha_i$, possibly different from the rest.

From Eq. (6), it is straightforward to prove that the point is also an
$\epsilon$-approximate Nash equilibrium, where $\epsilon$ is inversely
proportional to $\alpha$. An $\epsilon$-approximate Nash equilibrium is a
strategy profile such that no player can improve his payoff by more than
$\epsilon$ by switching to another strategy. In particular, a Nash
equilibrium [3] can be viewed as a 0-approximate one.
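
The definition of an approximate Nash equilibrium can be made concrete for a
two-player game. The sketch below is an illustration only: the function name
and the 2x2 payoff matrices are hypothetical and do not come from the paper.
It computes the smallest value of epsilon for which a given mixed strategy
profile is an epsilon-approximate Nash equilibrium, i.e. the largest gain any
single player could obtain by deviating to his best pure action.

```python
def epsilon_of_profile(row_payoff, col_payoff, p_row, p_col):
    """Largest unilateral gain available to either player at (p_row, p_col)."""
    n_rows, n_cols = len(p_row), len(p_col)
    # Expected payoff of each pure action against the opponent's mixed strategy.
    row_values = [sum(row_payoff[a][b] * p_col[b] for b in range(n_cols))
                  for a in range(n_rows)]
    col_values = [sum(col_payoff[a][b] * p_row[a] for a in range(n_rows))
                  for b in range(n_cols)]
    # Expected payoff each player currently receives.
    row_current = sum(p_row[a] * row_values[a] for a in range(n_rows))
    col_current = sum(p_col[b] * col_values[b] for b in range(n_cols))
    return max(max(row_values) - row_current, max(col_values) - col_current)

# Hypothetical coordination game, indexed [row action][column action].
ROW = [[2.0, 0.0], [0.0, 1.0]]
COL = [[1.0, 0.0], [0.0, 2.0]]

eps_pure = epsilon_of_profile(ROW, COL, [1.0, 0.0], [1.0, 0.0])
eps_mixed = epsilon_of_profile(ROW, COL, [0.5, 0.5], [0.5, 0.5])
print("pure profile:", eps_pure, "uniform profile:", eps_mixed)
```

A profile with epsilon equal to zero is an exact Nash equilibrium; the
uniform profile in this hypothetical game is only an approximate one.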

Theorem 3. When the selfishness level $\alpha$ is sufficiently large, i.e.,
$\alpha \to \infty$, any equilibrium point for the set of iterative equations
(5) can be arbitrarily close to a Nash equilibrium and vice versa.

Some experimental results are given in the following section to demonstrate 
the improvement on the average individual payoff and the stability of game 
playing by reducing the selfishness level. 



4 Experimental Results 



An example payoff matrix of the prisoner's dilemma is given as follows (each
entry lists the row player's payoff, then the column player's payoff):

              Cooperate    Defect
  Cooperate     3, 3        1, 4
  Defect        4, 1        2, 2

Payoffs of prisoner's dilemma under different selfishness levels are shown in 
Fig. [TJ When the selfishness level a reduces to one (a = 1), the payoff for each 
player has an 20% improvement over the one at the Nash equilibrium (a = oo). 
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Fig. 1. Payoffs of prisoner's dilemma under different selfishness levels. 
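
The prisoner's dilemma result can be reproduced approximately with a short
fixed-point iteration of Eqs. (5) and (6). This is a sketch under stated
assumptions rather than the procedure used in the experiments: the payoffs in
the matrix are used directly as the utilities $U_i(x)$, both players start
from the uniform distribution, and the function names, the iteration count,
and the list of tested selfishness levels are hypothetical.

```python
# Payoffs indexed [row action][column action]; action 0 = Cooperate, 1 = Defect.
ROW_PAYOFF = [[3.0, 1.0], [4.0, 2.0]]
COL_PAYOFF = [[3.0, 4.0], [1.0, 2.0]]

def normalize_power(psi, alpha):
    """Eq. (6): probabilities proportional to psi ** alpha."""
    weights = [v ** alpha for v in psi]
    total = sum(weights)
    return [w / total for w in weights]

def find_equilibrium(alpha, iterations=500):
    """Iterate Eqs. (5) and (6) from the uniform distribution."""
    p_row, p_col = [0.5, 0.5], [0.5, 0.5]
    for _ in range(iterations):
        # Eq. (5): expected return of each action given the other player's
        # probabilities from the previous iteration.
        psi_row = [sum(ROW_PAYOFF[a][b] * p_col[b] for b in range(2)) for a in range(2)]
        psi_col = [sum(COL_PAYOFF[a][b] * p_row[a] for a in range(2)) for b in range(2)]
        p_row = normalize_power(psi_row, alpha)
        p_col = normalize_power(psi_col, alpha)
    return p_row, p_col

def expected_payoff(payoff, p_row, p_col):
    """Expected payoff under independent mixed strategies."""
    return sum(p_row[a] * p_col[b] * payoff[a][b] for a in range(2) for b in range(2))

results = {}
for alpha in (1.0, 2.0, 5.0, 30.0):
    p_row, p_col = find_equilibrium(alpha)
    results[alpha] = expected_payoff(ROW_PAYOFF, p_row, p_col)
    print(f"alpha={alpha:5.1f}  P(cooperate)={p_row[0]:.4f}  row payoff={results[alpha]:.4f}")
```

Under these assumptions the row player's payoff at $\alpha = 1$ is about 2.39,
roughly 20% above the payoff of 2 at the Nash equilibrium (Defect, Defect),
and the payoff falls back towards 2 as $\alpha$ grows.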



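A two-player game used elsewhere in the game theory literature has the
following payoff matrix:

$$
\begin{pmatrix}
2,\ 3 & -1,\ 4 & 2,\ 4 & 5,\ 2 & 1,\ -1 \\
2,\ 2 & 3, & 4,\ 1 & -2,\ 4 & 1,\ 3 \\
4,\ 6 & 7,\ 2 & 2,\ -2 & 4,\ 9 & 2,\ 1 \\
9, & -2,\ 6 & 6,\ 3 & 7, & 0,\ 5 \\
3,\ 2 & 6,\ 1 & 2,\ 5 & 5,\ 3 & 1,\ 0
\end{pmatrix}
$$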



This game has only one mixed Nash equilibrium. Payoffs of the game under
different selfishness levels $\alpha$ are shown in Fig. 2. At $\alpha = 7$,
the payoff of the row player shows a 19% improvement over the one at the Nash
equilibrium, and the payoff of the column player shows an 8.4% improvement at
the same time.

Fig. 2. Payoffs of two players with 5 actions under different selfishness
levels.

Computer-generated societies are also used to test the impact of the
selfishness level $\alpha$ on the payoffs of players and the stability of the
societies. In each computer-generated society, each individual has a number of
neighbors, and his payoff function is defined as the sum of the pairwise
joint-action payoffs between himself and his neighbors:

$$u_i(x) = \sum_{j \in \mathcal{N}(i)} f_{ij}(x_i, x_j), \quad \text{for } i = 1, 2, \ldots, n,$$

where $\mathcal{N}(i)$ is the set of individual $i$'s neighbors. Each function
value $f_{ij}(x_i, x_j)$ is uniformly sampled from the interval $[-0.6, 0.4]$.
The neighbors of each individual are randomly picked from the entire
population. The overall payoff of the society is defined as
$u_1(x) + u_2(x) + \cdots + u_n(x)$.

In the first experiment, a society of 1,001 individuals is generated where
each one has 10 actions and 50 neighbors on average. 300 Nash equilibria
($\alpha \to \infty$) are discovered, together with 300 equilibria under the
selfishness level $\alpha = 30$. Fig. 3 shows the overall payoffs of the first
300 versus the second 300. From the figure we can see that reducing the
selfishness level can lead to remarkable improvements both in the overall
payoff and in the stability (stability here is measured by the fluctuation of
the overall payoffs of the equilibrium points). At the same time, the quality
of the overall payoffs has reached a new level: the worst of the 300
equilibria under the selfishness level $\alpha = 30$ is still better than the
best of the 300 Nash equilibria in terms of the overall payoff.

Fig. 3. For a computer-generated society of 1,001 individuals, when the
individuals are completely selfish ($\alpha = \infty$), 300 equilibria are
found with their overall payoffs shown as the bottom connected lines. When
they are less selfish ($\alpha = 30$), 300 equilibria are also found with
their overall payoffs shown as the top connected lines. When
$\alpha = \infty$, the average value of the overall payoffs is 243.62 and the
variance is 3,335. When $\alpha = 30$, the corresponding values are 647.43 and
818.

A less selfish society can be more efficient than a completely selfish society
in terms of finding an equilibrium with a good overall payoff. To compare the
efficiency, a society of 121 individuals with 50 actions each and 6 neighbors
on average is generated. When the individuals in the society are less selfish
($\alpha = 20$), the average overall payoff of the 300 equilibria found by the
society is 187. When they are completely selfish, after exploring one million
equilibria, the best overall payoff is 185.9, still less than the former (see
Fig. 4). The less selfish society spent seconds on average to find an
equilibrium, while the completely selfish society took almost a whole day to
find the one million equilibria using a laptop with an AMD Turion X2 Dual-Core
Mobile Processor and 3GB RAM. The former is much more efficient than the
latter.
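
The construction of a computer-generated society can be sketched as follows.
This is an illustration under assumptions that the text does not settle: every
individual gets exactly the same number of neighbors (the experiments state
only an average), the neighbor relation is not forced to be symmetric, and all
function names, sizes, and the random seed are hypothetical. The sketch only
builds the pairwise payoff tables and evaluates the overall payoff of one
joint action; it does not search for equilibria.

```python
import random

def build_society(n, num_actions, num_neighbors, rng):
    """Pick random neighbors and sample pairwise payoff tables f_ij."""
    neighbors, tables = {}, {}
    for i in range(n):
        others = [j for j in range(n) if j != i]
        neighbors[i] = rng.sample(others, num_neighbors)
        for j in neighbors[i]:
            # f_ij(x_i, x_j) sampled uniformly from [-0.6, 0.4].
            tables[(i, j)] = [[rng.uniform(-0.6, 0.4) for _ in range(num_actions)]
                              for _ in range(num_actions)]
    return neighbors, tables

def individual_payoff(i, x, neighbors, tables):
    """u_i(x): sum of f_ij(x_i, x_j) over the neighbors j of individual i."""
    return sum(tables[(i, j)][x[i]][x[j]] for j in neighbors[i])

def overall_payoff(x, neighbors, tables):
    """u_1(x) + u_2(x) + ... + u_n(x)."""
    return sum(individual_payoff(i, x, neighbors, tables) for i in range(len(x)))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
neighbors, tables = build_society(n=20, num_actions=3, num_neighbors=4, rng=rng)
x = [rng.randrange(3) for _ in range(20)]  # one arbitrary joint action
total = overall_payoff(x, neighbors, tables)
print("overall payoff of a random joint action:", round(total, 3))
```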



5 Conclusion 

Accepting sub-optimal actions is a general decision-optimization principle for
cooperation. It defines a global optimization approach to understanding
quantum mechanics. It also offers a strategy for improving social stability
and individual payoffs over the classic profit-maximization principle.

The fundamental principle of rational decision making in classic game theory
is that each player in a game maximizes his own payoff. The logical
justification of this principle seems obvious, and it shaped the definition of
Nash equilibrium more than 50 years ago. However, the study presented in this
paper shows that the optimality of this principle is conditional. The optimal
decision of each player in the classic sense may not lead to a good payoff for
the player. Defying the conventional wisdom, compromising it by accepting
sub-optimal actions can improve both the overall payoff (equivalently, the
average individual payoff) and the stability of game playing. This study
suggests that, for the benefit of everyone in a society (or a financial
market), the pursuit of maximal payoff by each individual should be controlled
at some level, either by voluntary good citizenship or by imposed regulations.



Fig. 4. After exploring one million equilibria by a society of 121 completely
selfish individuals (the solid line), the best one in terms of the overall
payoff still could not match the single trial (averaged) by the same society
when all the individuals are less selfish (the dotted line). Horizontal axis:
the number of equilibria.